On the asymptotic optimality of greedy index heuristics for multi-action restless bandits
The class of restless bandits proposed by Whittle (1988) has long been known to be intractable. This paper presents an optimality result which extends that of Weber and Weiss (1990) for restless bandits to a more general setting in which individual bandits have multiple levels of activation but are subject to an overall resource constraint. The contribution is motivated by the recent works of Glazebrook et al. (2011a, 2011b), who discussed the performance of index heuristics for resource allocation in such systems. Hitherto, index heuristics have been shown, under a condition of full indexability, to be optimal for a natural Lagrangian relaxation of such problems in which a resource is purchased rather than constrained. We find that, under key assumptions about the nature of solutions to a deterministic differential equation, the index heuristics above are asymptotically optimal in a sense described by Whittle. We then demonstrate that these assumptions always hold for three-state bandits.
Developing effective service policies for multiclass queues with abandonment: asymptotic optimality and approximate policy improvement
We study a single-server queueing model with multiple classes and impatient customers. The goal is to determine a service policy that maximizes the long-run reward rate earned from serving customers, net of holding costs and abandonment penalties incurred respectively when customers wait for service and when they leave before receiving it. We first show that it is without loss of generality to study a pure-reward model. Since standard methods can usually only compute the optimal policy for problems with up to three customer classes, our focus is to develop a suite of heuristic approaches, with a preference for operationally simple policies with good reward characteristics. One such heuristic is the Rμθ rule, a priority policy that ranks all customer classes based on the product of reward R, service rate μ, and abandonment rate θ. We show that the Rμθ rule is asymptotically optimal as customer abandonment rates approach zero and often performs well in cases where the simpler Rμ rule performs poorly. The paper also develops an approximate policy improvement method that uses simulation and interpolation to estimate the bias function for use in a dynamic programming recursion. For systems with two or three customer classes, our numerical study indicates that the best of our simple priority policies is near optimal in most cases; when it is not, the approximate policy improvement method invariably narrows the gap substantially. For systems with five customer classes, our heuristics typically achieve within 4% of an upper bound for the optimal value, which is computed via a linear program that relies on a relaxation of the original system. The computational requirement of the approximate policy improvement method grows rapidly when the number of customer classes or the traffic intensity increases.
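The ranking step of the Rμθ rule can be illustrated with a minimal Python sketch. The class names and parameter values are hypothetical, and the sketch assumes that a larger product R·μ·θ means higher priority, which the abstract does not state explicitly.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CustomerClass:
    """Hypothetical container for the three parameters named in the abstract."""
    name: str
    reward: float            # R: reward earned per served customer
    service_rate: float      # mu: service completion rate
    abandonment_rate: float  # theta: rate at which waiting customers leave


def r_mu_theta_index(c: CustomerClass) -> float:
    """Product R * mu * theta used to rank a customer class."""
    return c.reward * c.service_rate * c.abandonment_rate


def priority_order(classes):
    """Return the classes sorted by decreasing index (assumed priority direction)."""
    return sorted(classes, key=r_mu_theta_index, reverse=True)


# Illustrative values only; they are not taken from the paper.
classes = [
    CustomerClass("A", reward=1.0, service_rate=2.0, abandonment_rate=0.5),
    CustomerClass("B", reward=3.0, service_rate=1.0, abandonment_rate=0.1),
    CustomerClass("C", reward=2.0, service_rate=1.0, abandonment_rate=1.0),
]
ranking = [c.name for c in priority_order(classes)]
```

With these illustrative values, class B has the largest Rμ product but the smallest Rμθ product, so the two rules can disagree on which class to serve first.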
Assessing an intuitive condition for stability under a range of traffic conditions via a generalised Lu-Kumar network
We argue the importance both of developing simple sufficient conditions for the stability of general multiclass queueing networks and of assessing such conditions under a range of assumptions on the weight of the traffic flowing between service stations. To achieve the former, we review a peak-rate stability condition and extend its range of application; for the latter, we introduce a generalisation of the Lu-Kumar network on which the stability condition may be tested for a range of traffic configurations. The peak-rate condition is close to exact when the between-station traffic is light, but degrades as this traffic increases.
Keywords: multiclass queueing networks, stability, fluid model, Lu-Kumar network
On the identification and mitigation of weaknesses in the Knowledge Gradient policy for multi-armed bandits
The Knowledge Gradient (KG) policy was originally proposed for online ranking and selection problems but has recently been adapted for use in online decision making in general and multi-armed bandit problems (MABs) in particular. We study its use in a class of exponential family MABs and identify weaknesses, including a propensity to take actions which are dominated with respect to both exploitation and exploration. We propose variants of KG which avoid such errors. These new policies include an index heuristic which deploys a KG approach to develop an approximation to the Gittins index. A numerical study shows this policy to perform well over a range of MABs, including those for which index policies are not optimal. While KG does not take dominated actions when bandits are Gaussian, it fails to be index consistent and, when arms are correlated, appears not to enjoy a performance advantage over competitor policies that would compensate for its greater computational demands.
Applications of stochastic modeling in air traffic management: Methods, challenges and opportunities for solving air traffic problems under uncertainty
In this paper we provide a wide-ranging review of the literature on stochastic modeling applications within aviation, with a particular focus on problems involving demand and capacity management and the mitigation of air traffic congestion. From an operations research perspective, the main techniques of interest include analytical queueing theory, stochastic optimal control, robust optimization and stochastic integer programming. Applications of these techniques include the prediction of operational delays at airports, pre-tactical control of aircraft departure times, and dynamic control and allocation of scarce airport resources, among others. We provide a critical review of recent developments in the literature and identify promising research opportunities for stochastic modelers within air traffic management.
The Mass Assembly Histories of Galaxies of Various Morphologies in the GOODS Fields
We present an analysis of the growth of stellar mass with cosmic time
partitioned according to galaxy morphology. Using a well-defined catalog of
2150 galaxies based, in part, on archival data in the GOODS fields, we assign
morphological types in three broad classes (Ellipticals, Spirals,
Peculiar/Irregulars) to a limit of z_AB=22.5 and make the resulting catalog
publicly available. We combine redshift information, optical photometry from
the GOODS catalog and deep K-band imaging to assign stellar masses. We find
little evolution in the form of the galaxy stellar mass function from z~1 to
z=0, especially at the high mass end where our results are most robust.
Although the population of massive galaxies is relatively well established at
z~1, its morphological mix continues to change, with an increasing proportion
of early-type galaxies at later times. By constructing type-dependent stellar
mass functions, we show that in each of three redshift intervals, E/S0's
dominate the higher mass population, while spirals are favored at lower masses.
This transition occurs at a stellar mass of 2-3 × 10^10 Msun at z~0.3
(similar to local studies) but there is evidence that the relevant mass scale
moves to higher mass at earlier epochs. Such evolution may represent the
morphological extension of the "downsizing" phenomenon, in which the most
massive galaxies stop forming stars first, with lower mass galaxies becoming
quiescent later. We infer that more massive galaxies evolve into spheroidal
systems at earlier times, and that this morphological transformation may only
be completed 1-2 Gyr after the galaxies emerge from their active star forming
phase. We discuss several lines of evidence suggesting that merging may play a
key role in generating this pattern of evolution.
Comment: 24 pages, 1 table, 8 figures, accepted for publication in Ap
The achievable region approach to the optimal control of stochastic systems
The achievable region approach seeks solutions to stochastic optimisation problems by: (i) characterising the space of all possible performances (the achievable region) of the system of interest, and (ii) optimising the overall system-wide performance objective over this space. This is radically different from conventional formulations based on dynamic programming. The approach is explained with reference to a simple two-class queueing system. Powerful new methodologies due to the authors and co-workers are deployed to analyse a general multiclass queueing system with parallel servers and then to develop an approach to optimal load distribution across a network of interconnected stations. Finally, the approach is used for the first time to analyse a class of intensity control problems.
Keywords: achievable region, Gittins index, linear programming, load balancing, multi-class queueing systems, performance space, stochastic optimisation, threshold policy
Modeling and analysis of uncertain time-critical tasking problems
Naval Research Logistics, 53, No. 6 (Sept. 2006), 588-599. This paper describes the modeling and operational analysis of a generic asymmetric services-system situation in which (a) Red agents, which are potentially threatening but, in another important interpretation, are isolated friendlies such as downed pilots that require assistance, "arrive" according to some partially known and potentially changing pattern in time and space; and (b) Reds have effectively limited, unknown deadlines or times of availability for Blue service, i.e., detection, classification, and attack in a military setting, or emergency assistance in others. We discuss various service options for Blue service agents and devise several approximations that allow one to compute efficiently the proportions of tasks of different classes that are successfully serviced or, more generally, if different rewards are associated with different classes of tasks, the percentage of the possible reward gained. We suggest heuristic policies by which a Blue server selects the next task to perform and decides how much time to allocate to that service. We discuss these policies for a number of specific examples.
A Classical Search Game In Discrete Locations
Consider a two-person zero-sum search game between a hider and a searcher. The hider hides among n discrete locations, and the searcher successively visits individual locations until finding the hider. Known to both players, a search at location i takes t_i time units and detects the hider, if hidden there, independently with probability α_i, for i = 1,...,n. The hider aims to maximize the expected time until detection, while the searcher aims to minimize it. We prove the existence of an optimal strategy for each player. In particular, any optimal mixed hiding strategy hides in each location with a nonzero probability, and there exists an optimal mixed search strategy which can be constructed with up to n simple search sequences.
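The payoff in this game, the expected time until detection, can be sketched numerically for one fixed hiding location and one fixed search order. The Python function below is an illustration only: it assumes the searcher repeats a finite cycle of locations indefinitely (a simplification, not necessarily the "simple search sequences" of the paper), and the function name, argument names, and tolerance are hypothetical.

```python
def expected_detection_time(cycle, hider, times, probs, tol=1e-12):
    """Expected time to find a hider at location `hider` when the searcher
    repeats the list `cycle` of location indices forever (an assumed policy).

    times[i] is the duration of one search at location i (t_i), and
    probs[i] is the per-search detection probability there (alpha_i).
    The infinite sum is truncated once the probability that the hider is
    still undetected falls to `tol` or below.
    """
    if hider not in cycle or probs[hider] <= 0.0:
        raise ValueError("the hider's location is never searched effectively")

    elapsed = 0.0    # total time spent searching so far
    survival = 1.0   # probability the hider has not been detected yet
    expected = 0.0   # running sum of (detection time) * (its probability)
    while survival > tol:
        for loc in cycle:
            elapsed += times[loc]
            if loc == hider:
                expected += elapsed * survival * probs[loc]
                survival *= 1.0 - probs[loc]
    return expected


# Single location with t = 2 and alpha = 0.5: the number of searches needed
# is geometric with mean 1 / 0.5 = 2, so the expected time is 2 * 2 = 4.
single = expected_detection_time([0], hider=0, times=[2.0], probs=[0.5])

# Two locations searched alternately with certain detection: a hider at
# location 1 is found at the end of the second search, at time 2.
certain = expected_detection_time([0, 1], hider=1, times=[1.0, 1.0], probs=[1.0, 1.0])
```

Evaluating this quantity for every pair of hiding location and candidate search sequence would give the entries of a payoff matrix for a restricted version of the game; the paper's existence results concern the unrestricted game.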